Human T cell states after checkpoint blockade
In this case study, we will use ProjecTILs to interpret human scRNA-seq T cell data in the context of a murine TIL atlas. We are going to use the single-cell dataset by Sade-Feldman GSE120575 to illustrate interspecies mapping of human cells on a stable murine atlas using gene orthologs.
Background
In the study by Sade-Feldman et al. (2018) Cell, the authours characterize the transcriptomics profile of immune cells from melanoma patients treated with checkpoint inhibitors, with the goal to identify factors that associate with success or failure of immune checkpoint therapy. They find that certain CD8+ T cell states associate with clinical outcome, and in particular that the transcription factor TCF7 is a significant marker for response to anti-PD-1 therapy. This suggests that the state of T cells found within a tumor is critical for induction of effective tumor immunity, highlighting the importance of characterizing (and potentially intervening on) the diversity of tumor-infiltraring T cells.
The single-cell expression data (GSE120575) consists of CD45+ single cells from 48 tumor samples of melanoma patients treated with checkpoint inhibitors, and sequenced by Smart-seq2 technology. Meta-data on response to therapy (Responder vs. Non-responder) is also available on the same GEO identifier.
Please use the development version of ProjecTILs >0.4.1, which implements ortholog gene conversion between human and mouse.
R Environment
Check & load R packages
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS = "true")
if (!requireNamespace("renv")) install.packages("renv")
library(renv)
renv::restore()
# Load development version of ProjecTILs
remotes::install_git("https://gitlab.unil.ch/carmona/ProjecTILs.git", ref = "v0.4.1")
library(ProjecTILs)
library(gridExtra)scRNA-seq data preparation
Download the count matrix and metadata from Gene Expression Omnibus (GEO), and store as Seurat object.
cached.object <- "SadeFeldman.seurat.rds"
if (!file.exists(cached.object)) {
library(GEOquery)
geo_acc <- "GSE120575"
datadir <- "input/SadeFeldman"
gse <- getGEO(geo_acc)
series <- paste0(geo_acc, "_series_matrix.txt.gz")
system(paste0("mkdir -p ", datadir))
getGEOSuppFiles(geo_acc, baseDir = datadir)
## Load expression matrix and metadata
exp.mat <- read.delim(sprintf("%s/%s/GSE120575_Sade_Feldman_melanoma_single_cells_TPM_GEO.txt.gz",
datadir, geo_acc), header = F, sep = "\t")
genes <- exp.mat[c(-1, -2), 1]
cells <- as.vector(t(exp.mat[1, 2:16292]))
patient <- as.factor(t(exp.mat[2, 2:16292]))
exp.mat <- exp.mat[c(-1, -2), 2:16292]
colnames(exp.mat) <- cells
rownames(exp.mat) <- genes
meta <- read.delim(sprintf("%s/%s/GSE120575_patient_ID_single_cells.txt.gz",
datadir, geo_acc), header = T, sep = "\t", skip = 19, nrows = 16291)
meta <- meta[, 1:7]
treat <- factor(ifelse(grepl("Post", patient), "Post", "Pre"))
response <- factor(meta$characteristics..response)
therapy <- factor(meta$characteristics..therapy)
## Create Seurat object and add meta data
query.object <- CreateSeuratObject(counts = exp.mat, project = "SadeFeldman",
min.cells = 10, min.features = 50)
query.object@meta.data$Patient <- patient
query.object@meta.data$Time <- treat
query.object@meta.data$Response <- response
query.object@meta.data$Therapy <- therapy
saveRDS(query.object, file = cached.object)
} else {
query.object <- readRDS(cached.object)
}Some basic statistics - cells per group (Pre vs. Post, Therapy, Responder vs. Non-responder)
Post Pre
10363 5928
anti-CTLA4 anti-CTLA4+PD1 anti-PD1
517 4121 11653
Non-responder Responder
10727 5564
ProjecTILs
Load reference TIL atlas - if it’s not present in the working directory, it will be downloaded from the repository
[1] "Loading Default Reference Atlas..."
[1] "/Users/mass/Documents/Projects/Github/ProjecTILs_CaseStudies/ref_TILAtlas_mouse_v1.rds"
[1] "Loaded Reference map ref_TILAtlas_mouse_v1"
Run projection algorithm - version 0.4.1 supports human ortholog conversion (human.ortho=T)
[1] "Using assay RNA for query"
[1] "4862 out of 16291 ( 30 % ) non-pure T cells removed. Use filter.cells=FALSE to avoid pre-filtering (NOT RECOMMENDED)"
[1] "Transforming expression matrix into space of mouse orthologs"
[1] "Aligning query to reference map for batch-correction..."
Projecting corrected query onto Reference PCA space
[1] "Projecting corrected query onto Reference UMAP space"
Plot global projection of new data over the reference in UMAP space
Predict the cell states in the query set
query.projected <- cellstate.predict(ref = ref, query = query.projected)
table(query.projected$functional.cluster)
CD4_NaiveLike CD8_EarlyActiv CD8_EffectorMemory CD8_NaiveLike
187 1181 2153 2119
CD8_Tex CD8_Tpex Tfh Th1
1594 675 922 1523
Treg
1075
The expression profiles or some key genes correspond fairly well to those of the reference
query.list <- SplitObject(query.projected, split.by = "Response")
plot.states.radar(ref, query = query.list, min.cells = 50, genes4radar = c("Foxp3",
"Cd4", "Cd8a", "Tcf7", "Ccr7", "Gzmb", "Pdcd1", "Havcr2", "Tox", "Entpd1", "Cxcr5",
"Ifng"))Note that projections and comparisons are performed in the ortholog space of murine genes - to check the names of human-mouse orthologs you can examine the conversion table for genes of interest:
data(Hs2Mm.convert.table)
which.genes <- c("TCF7", "GZMB", "CD8B", "PDCD1")
Hs2Mm.convert.table[Hs2Mm.convert.table$Gene.HS %in% which.genes, ] Gene.stable.ID.HS Gene.HS Gene.MM Alt.symbol
7381 ENSG00000254126 CD8B Cd8b1 Cd8b1
7835 ENSG00000188389 PDCD1 Pdcd1 Pdcd1
11096 ENSG00000081059 TCF7 Tcf7 Tcf7
16688 ENSG00000100453 GZMB Gzmb Gzmd,Gzme,Gzmf,Gzmc,Gzmn,Gzmg,Gzmb
Alt.symbol.HS
7381 CD8B2,CD8B
7835 PDCD1
11096 TCF7
16688 GZMB
Response to therapy
Now to the interesting part. Let’s visualize the projection and T cell state distribution by response to therapy:
query.list <- SplitObject(query.projected, split.by = "Response")
pll <- list()
pll[[1]] <- plot.projection(ref, query.list[["Responder"]]) + ggtitle("Responder")
pll[[2]] <- plot.statepred.composition(ref, query.list[["Responder"]], metric = "Percent") +
ggtitle("Responder") + ylim(0, 35)
pll[[3]] <- plot.projection(ref, query.list[["Non-responder"]]) + ggtitle("Non-responder")
pll[[4]] <- plot.statepred.composition(ref, query.list[["Non-responder"]], metric = "Percent") +
ggtitle("Non-responder") + ylim(0, 35)
grid.arrange(grobs = pll, ncol = 2, nrow = 2, widths = c(1.5, 1))There is a clear shift to more differentiated states (e.g. CD8_Tex and CD8_EM) in non-responders, while responders have a higher fraction of Naive-like states.
To better examine the different in T cell phenotype composition between rsponders and non-responders, we can visualize the fold-change of T cell state frequency between the two groups:
which.types <- table(query.projected$functional.cluster) > 100
stateColors_func <- c("#edbe2a", "#A58AFF", "#53B400", "#F8766D", "#00B6EB", "#d1cfcc",
"#FF0000", "#87f6a5", "#e812dd")
states_all <- levels(factor(ref$functional.cluster))
names(stateColors_func) <- states_all
cols_use <- stateColors_func[names(which.types)][which.types]
# Responder vs non Responder
query.list <- SplitObject(query.projected, split.by = "Response")
norm.c <- table(query.list[["Non-responder"]]$functional.cluster)/sum(table(query.list[["Non-responder"]]$functional.cluster))
norm.q <- table(query.list[["Responder"]]$functional.cluster)/sum(table(query.list[["Responder"]]$functional.cluster))
foldchange <- norm.q[which.types]/norm.c[which.types]
foldchange <- sort(foldchange, decreasing = T)
tb.m <- melt(foldchange)
colnames(tb.m) <- c("Cell_state", "Fold_change")
pll <- list()
ggplot(tb.m, aes(x = Cell_state, y = Fold_change, fill = Cell_state)) + geom_bar(stat = "identity") +
scale_fill_manual(values = cols_use) + geom_hline(yintercept = 1) + scale_y_continuous(trans = "log2") +
theme(axis.text.x = element_blank(), legend.position = "left") + ggtitle("Responder vs. Non-responder")CD4 and CD8 Naive-like states are the most enriched in responders compared to non-responders, confirming the observation of the original paper that TCF7+ cells are associated with response to therapy. Additionally, we also observe an increase in CD4+ T follicular helper cells (Tfh), as well as a decrease in effector CD8+ cells. As expected, T regulatory cells (Tregs) are more prevalent in non-responders. Because Tregs are known to inhibit antitumor immunuty, this observation that could also explain the reduced tumor control in non-responding lesions.
Conclusions
Taking advantage of the ortholog mapping functionality of ProjecTILs, we have illustrated how to effortlessy analyze human scRNA-seq data in the context of a reference murine TIL atlas. Gene expression profiles confirm that T cells are accurately projected in major CD4+ and CD8+ categories, as well as in more specific subtypes. Comparing the transcriptomics profile of immune cells from melanoma patients treated with checkpoint inhibitors, and their composition in terms of cell states, confirms the observation of the original study that TCF7 is a marker and potential predictor of immunotherapy response. Additionally, ProjecTILs analysis in the context of a stable atlas gives a more complete picture of the T cell states that are “favorable” to tumor response - e.g. the enrichment of more naive-like states, and reduction of Tregs within the tumor.
Further reading
Original publication - Sade-Feldman et al. (2018) Cell
ProjecTILs case studies - INDEX - Repository